Overview

Dataset statistics

Number of variables: 13
Number of observations: 160
Missing cells: 50
Missing cells (%): 2.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.4 KiB
Average record size in memory: 104.8 B
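
The missing-cell totals can be cross-checked against the per-variable missing counts reported in the variable sections of this report. The sketch below is only an illustration: the dictionary name is hypothetical, and the counts are copied from the "Missing" entries of the individual variables.

```python
# Cross-check of the "Missing cells" figures in the dataset statistics.
# The per-column counts are copied from the variable sections of this report;
# the variable names used here are illustrative only.
n_variables = 13
n_observations = 160

missing_per_column = {
    "bedrooms": 6,
    "longitude": 15,
    "latitude": 15,
    "bathrooms": 6,
    "livingArea": 6,
    "lotAreaValue": 1,
    "lotAreaUnit": 1,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```

The percentage is taken over all cells (variables × observations), not over rows.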

Variable types

Numeric: 10
Categorical: 3

Alerts

address is highly correlated with longitude and 1 other field (High correlation)
price is highly correlated with bedrooms and 2 other fields (High correlation)
bedrooms is highly correlated with price and 2 other fields (High correlation)
longitude is highly correlated with address (High correlation)
latitude is highly correlated with address (High correlation)
bathrooms is highly correlated with price and 2 other fields (High correlation)
livingArea is highly correlated with price and 2 other fields (High correlation)
propertyType is highly correlated with lotAreaValue and 6 other fields (High correlation)
lotAreaValue is highly correlated with propertyType and 2 other fields (High correlation)
price is highly correlated with propertyType and 3 other fields (High correlation)
bedrooms is highly correlated with propertyType and 3 other fields (High correlation)
longitude is highly correlated with propertyType and 3 other fields (High correlation)
latitude is highly correlated with propertyType and 2 other fields (High correlation)
bathrooms is highly correlated with propertyType and 3 other fields (High correlation)
livingArea is highly correlated with propertyType and 3 other fields (High correlation)
lotAreaUnit is highly correlated with lotAreaValue (High correlation)
bedrooms has 6 (3.8%) missing values (Missing)
longitude has 15 (9.4%) missing values (Missing)
latitude has 15 (9.4%) missing values (Missing)
bathrooms has 6 (3.8%) missing values (Missing)
livingArea has 6 (3.8%) missing values (Missing)
zpid has unique values (Unique)
Unnamed: 0 has 4 (2.5%) zeros (Zeros)
lotAreaValue has 15 (9.4%) zeros (Zeros)

Reproduction

Analysis started: 2022-07-23 02:05:19.085802
Analysis finished: 2022-07-23 02:05:28.615325
Duration: 9.53 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

Unnamed: 0
Real number (ℝ≥0)

ZEROS

Distinct: 40
Distinct (%): 25.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.5
Minimum: 0
Maximum: 39
Zeros: 4
Zeros (%): 2.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 1.95
Q1: 9.75
Median: 19.5
Q3: 29.25
95th percentile: 37.05
Maximum: 39
Range: 39
Interquartile range (IQR): 19.5

Descriptive statistics

Standard deviation: 11.57963947
Coefficient of variation (CV): 0.5938276653
Kurtosis: -1.201452169
Mean: 19.5
Median Absolute Deviation (MAD): 10
Skewness: 0
Sum: 3120
Variance: 134.0880503
Monotonicity: Not monotonic
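
The quantile and descriptive statistics for this column can be reproduced by hand. The sketch below assumes the column consists of each integer from 0 to 39 repeated four times, which is what the frequency tables for this variable suggest (40 distinct values, count 4 each, sum 3120). It also assumes linear interpolation for percentiles and the sample (n - 1) variance; the helper name `percentile` is hypothetical, not part of the profiling tool.

```python
import statistics

# Assumed reconstruction of the column: each of 0..39 occurs 4 times.
values = sorted(v for v in range(40) for _ in range(4))


def percentile(sorted_values, q):
    """Percentile with linear interpolation between neighbouring ranks (q in 0..100)."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
q1 = percentile(values, 25)
q3 = percentile(values, 75)
iqr = q3 - q1
p5 = percentile(values, 5)
p95 = percentile(values, 95)
# Median absolute deviation: median of |x - median(x)|.
mad = statistics.median([abs(v - median) for v in values])

print(mean, variance, std, cv)
print(p5, q1, median, q3, p95, iqr, mad)
```

Under these assumptions the results agree with the figures listed above, for example CV = standard deviation / mean and IQR = Q3 - Q1.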
Histogram with fixed size bins (bins=40)

Common values

Value              Count  Frequency (%)
0                  4      2.5%
1                  4      2.5%
22                 4      2.5%
23                 4      2.5%
24                 4      2.5%
25                 4      2.5%
26                 4      2.5%
27                 4      2.5%
28                 4      2.5%
29                 4      2.5%
Other values (30)  120    75.0%

Minimum values

Value              Count  Frequency (%)
0                  4      2.5%
1                  4      2.5%
2                  4      2.5%
3                  4      2.5%
4                  4      2.5%
5                  4      2.5%
6                  4      2.5%
7                  4      2.5%
8                  4      2.5%
9                  4      2.5%

Maximum values

Value              Count  Frequency (%)
39                 4      2.5%
38                 4      2.5%
37                 4      2.5%
36                 4      2.5%
35                 4      2.5%
34                 4      2.5%
33                 4      2.5%
32                 4      2.5%
31                 4      2.5%
30                 4      2.5%

propertyType
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

SINGLE_FAMILY: 107
TOWNHOUSE: 32
CONDO: 14
LOT: 6
APARTMENT: 1

Length

Max length: 13
Median length: 13
Mean length: 11.1
Min length: 3

Characters and Unicode

Total characters: 1,776
Distinct characters: 20
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
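
The length and character statistics follow directly from the category counts. The sketch below is an illustrative check that uses the counts reported for this variable; the dictionary name is hypothetical.

```python
# Category counts copied from the propertyType summary in this report.
category_counts = {
    "SINGLE_FAMILY": 107,
    "TOWNHOUSE": 32,
    "CONDO": 14,
    "LOT": 6,
    "APARTMENT": 1,
}

n_rows = sum(category_counts.values())
# Total characters: each label's length weighted by how often it occurs.
total_characters = sum(len(label) * count for label, count in category_counts.items())
mean_length = total_characters / n_rows
# Distinct characters across all labels (letters plus the underscore).
distinct_characters = len(set("".join(category_counts)))
# Underscores only occur in SINGLE_FAMILY, once per value.
underscores = sum(label.count("_") * count for label, count in category_counts.items())

print(n_rows, total_characters, mean_length, distinct_characters, underscores)
```

The underscore count corresponds to the "Connector Punctuation" category in the Unicode breakdown, and the remaining characters are uppercase letters.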

Unique

Unique: 1
Unique (%): 0.6%

Sample

1st row: SINGLE_FAMILY
2nd row: SINGLE_FAMILY
3rd row: SINGLE_FAMILY
4th row: SINGLE_FAMILY
5th row: SINGLE_FAMILY

Common Values

Value              Count  Frequency (%)
SINGLE_FAMILY      107    66.9%
TOWNHOUSE          32     20.0%
CONDO              14     8.8%
LOT                6      3.8%
APARTMENT          1      0.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value              Count  Frequency (%)
single_family      107    66.9%
townhouse          32     20.0%
condo              14     8.8%
lot                6      3.8%
apartment          1      0.6%

Most occurring characters

Value              Count  Frequency (%)
L                  220    12.4%
I                  214    12.0%
N                  154    8.7%
E                  140    7.9%
S                  139    7.8%
A                  109    6.1%
M                  108    6.1%
Y                  107    6.0%
F                  107    6.0%
_                  107    6.0%
Other values (10)  371    20.9%

Most occurring categories

Value                  Count  Frequency (%)
Uppercase Letter       1669   94.0%
Connector Punctuation  107    6.0%

Most frequent character per category

Uppercase Letter
Value              Count  Frequency (%)
L                  220    13.2%
I                  214    12.8%
N                  154    9.2%
E                  140    8.4%
S                  139    8.3%
A                  109    6.5%
M                  108    6.5%
Y                  107    6.4%
F                  107    6.4%
G                  107    6.4%
Other values (9)   264    15.8%

Connector Punctuation
Value              Count  Frequency (%)
_                  107    100.0%

Most occurring scripts

Value              Count  Frequency (%)
Latin              1669   94.0%
Common             107    6.0%

Most frequent character per script

Latin
Value              Count  Frequency (%)
L                  220    13.2%
I                  214    12.8%
N                  154    9.2%
E                  140    8.4%
S                  139    8.3%
A                  109    6.5%
M                  108    6.5%
Y                  107    6.4%
F                  107    6.4%
G                  107    6.4%
Other values (9)   264    15.8%

Common
Value              Count  Frequency (%)
_                  107    100.0%

Most occurring blocks

Value              Count  Frequency (%)
ASCII              1776   100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
L                  220    12.4%
I                  214    12.0%
N                  154    8.7%
E                  140    7.9%
S                  139    7.8%
A                  109    6.1%
M                  108    6.1%
Y                  107    6.0%
F                  107    6.0%
_                  107    6.0%
Other values (10)  371    20.9%

lotAreaValue
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 80
Distinct (%): 50.3%
Missing: 1
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 3003.929572
Minimum: 0
Maximum: 10454.4
Zeros: 15
Zeros (%): 9.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0.4515
Median: 1306.8
Q3: 6468.66
95th percentile: 9147.6
Maximum: 10454.4
Range: 10454.4
Interquartile range (IQR): 6468.2085

Descriptive statistics

Standard deviation: 3597.029169
Coefficient of variation (CV): 1.197441246
Kurtosis: -1.121314935
Mean: 3003.929572
Median Absolute Deviation (MAD): 1306.55
Skewness: 0.7253657062
Sum: 477624.802
Variance: 12938618.84
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
0                  15     9.4%
8276.4             7      4.4%
1742.4             7      4.4%
8712               6      3.8%
1306.8             6      3.8%
6534               5      3.1%
9147.6             5      3.1%
5227.2             4      2.5%
0.3                4      2.5%
7840.8             4      2.5%
Other values (70)  96     60.0%

Minimum values

Value              Count  Frequency (%)
0                  15     9.4%
0.25               3      1.9%
0.27               1      0.6%
0.28               1      0.6%
0.29               2      1.2%
0.3                4      2.5%
0.31               1      0.6%
0.32               1      0.6%
0.33               2      1.2%
0.34               2      1.2%

Maximum values

Value              Count  Frequency (%)
10454.4            2      1.2%
10018.8            1      0.6%
10018              1      0.6%
9583.2             2      1.2%
9321.84            1      0.6%
9147.6             5      3.1%
8929.8             1      0.6%
8712               6      3.8%
8624.88            1      0.6%
8276.4             7      4.4%

address
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 31
Distinct (%): 19.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27570.51875
Minimum: 27502
Maximum: 27617
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 27502
5th percentile: 27502
Q1: 27528.25
Median: 27587
Q3: 27610
95th percentile: 27616
Maximum: 27617
Range: 115
Interquartile range (IQR): 81.75

Descriptive statistics

Standard deviation: 41.94185431
Coefficient of variation (CV): 0.001521257351
Kurtosis: -1.568390583
Mean: 27570.51875
Median Absolute Deviation (MAD): 28
Skewness: -0.3586949912
Sum: 4411283
Variance: 1759.119143
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=31)

Common values

Value              Count  Frequency (%)
27587              16     10.0%
27502              11     6.9%
27616              10     6.2%
27615              10     6.2%
27526              9      5.6%
27603              9      5.6%
27529              9      5.6%
27540              8      5.0%
27612              8      5.0%
27519              8      5.0%
Other values (21)  62     38.8%

Minimum values

Value              Count  Frequency (%)
27502              11     6.9%
27511              6      3.8%
27513              5      3.1%
27518              1      0.6%
27519              8      5.0%
27526              9      5.6%
27529              9      5.6%
27539              5      3.1%
27540              8      5.0%
27545              3      1.9%

Maximum values

Value              Count  Frequency (%)
27617              4      2.5%
27616              10     6.2%
27615              10     6.2%
27614              3      1.9%
27613              2      1.2%
27612              8      5.0%
27610              7      4.4%
27609              2      1.2%
27608              1      0.6%
27607              2      1.2%

price
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 126
Distinct (%): 78.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 532569.3688
Minimum: 1795
Maximum: 2737500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 1795
5th percentile: 80850
Q1: 358750
Median: 485500
Q3: 625591
95th percentile: 1101844.5
Maximum: 2737500
Range: 2735705
Interquartile range (IQR): 266841

Descriptive statistics

Standard deviation: 326308.1213
Coefficient of variation (CV): 0.6127053871
Kurtosis: 13.7749377
Mean: 532569.3688
Median Absolute Deviation (MAD): 135500
Skewness: 2.67972826
Sum: 85211099
Variance: 1.0647699 × 10^11
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
500000              5      3.1%
350000              3      1.9%
440000              3      1.9%
380000              3      1.9%
490000              3      1.9%
395000              3      1.9%
80850               3      1.9%
305000              3      1.9%
410000              2      1.2%
660000              2      1.2%
Other values (116)  130    81.2%

Minimum values

Value               Count  Frequency (%)
1795                1      0.6%
2300                1      0.6%
2345                1      0.6%
2395                1      0.6%
59000               1      0.6%
77000               1      0.6%
80850               3      1.9%
170000              1      0.6%
209000              1      0.6%
224921              1      0.6%

Maximum values

Value               Count  Frequency (%)
2737500             1      0.6%
1700000             1      0.6%
1625000             1      0.6%
1319000             1      0.6%
1300000             1      0.6%
1250000             1      0.6%
1160000             1      0.6%
1150000             1      0.6%
1099310             1      0.6%
1010000             1      0.6%

bedrooms
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 6
Distinct (%): 3.9%
Missing: 6
Missing (%): 3.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.396103896
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 3
Median: 3
Q3: 4
95th percentile: 5
Maximum: 6
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8738677542
Coefficient of variation (CV): 0.2573147881
Kurtosis: 0.5721530414
Mean: 3.396103896
Median Absolute Deviation (MAD): 1
Skewness: 0.3233911849
Sum: 523
Variance: 0.7636448519
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common values

Value              Count  Frequency (%)
3                  67     41.9%
4                  55     34.4%
2                  19     11.9%
5                  9      5.6%
6                  3      1.9%
1                  1      0.6%
(Missing)          6      3.8%

Minimum values

Value              Count  Frequency (%)
1                  1      0.6%
2                  19     11.9%
3                  67     41.9%
4                  55     34.4%
5                  9      5.6%
6                  3      1.9%

Maximum values

Value              Count  Frequency (%)
6                  3      1.9%
5                  9      5.6%
4                  55     34.4%
3                  67     41.9%
2                  19     11.9%
1                  1      0.6%

longitude
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 144
Distinct (%): 99.3%
Missing: 15
Missing (%): 9.4%
Infinite: 0
Infinite (%): 0.0%
Mean: -78.68404174
Minimum: -78.94407
Maximum: -78.33526
Zeros: 0
Zeros (%): 0.0%
Negative: 145
Negative (%): 90.6%
Memory size: 1.4 KiB

Quantile statistics

Minimum: -78.94407
5th percentile: -78.902798
Q1: -78.79274
Median: -78.67654
Q3: -78.56809
95th percentile: -78.48577
Maximum: -78.33526
Range: 0.60881
Interquartile range (IQR): 0.22465

Descriptive statistics

Standard deviation: 0.136191241
Coefficient of variation (CV): -0.001730862294
Kurtosis: -0.9080534072
Mean: -78.68404174
Median Absolute Deviation (MAD): 0.11395
Skewness: 0.09274693872
Sum: -11409.18605
Variance: 0.01854805412
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
-78.83127           2      1.2%
-78.64864           1      0.6%
-78.54815           1      0.6%
-78.56809           1      0.6%
-78.54144           1      0.6%
-78.66999           1      0.6%
-78.55879           1      0.6%
-78.70883           1      0.6%
-78.623566          1      0.6%
-78.501             1      0.6%
Other values (134)  134    83.8%
(Missing)           15     9.4%

Minimum values

Value               Count  Frequency (%)
-78.94407           1      0.6%
-78.92465           1      0.6%
-78.921524          1      0.6%
-78.91384           1      0.6%
-78.90827           1      0.6%
-78.90382           1      0.6%
-78.9036            1      0.6%
-78.903465          1      0.6%
-78.90013           1      0.6%
-78.89701           1      0.6%

Maximum values

Value               Count  Frequency (%)
-78.33526           1      0.6%
-78.41908           1      0.6%
-78.427765          1      0.6%
-78.45669           1      0.6%
-78.462395          1      0.6%
-78.465294          1      0.6%
-78.46973           1      0.6%
-78.48546           1      0.6%
-78.48701           1      0.6%
-78.48818           1      0.6%

latitude
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 144
Distinct (%): 99.3%
Missing: 15
Missing (%): 9.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.79074312
Minimum: 35.558064
Maximum: 36.02089
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 35.558064
5th percentile: 35.6108988
Q1: 35.725338
Median: 35.788887
Q3: 35.874184
95th percentile: 35.9572302
Maximum: 36.02089
Range: 0.462826
Interquartile range (IQR): 0.148846

Descriptive statistics

Standard deviation: 0.1076369566
Coefficient of variation (CV): 0.003007396527
Kurtosis: -0.8314627568
Mean: 35.79074312
Median Absolute Deviation (MAD): 0.080749
Skewness: -0.07931923467
Sum: 5189.657753
Variance: 0.01158571442
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
35.78522            2      1.2%
35.725372           1      0.6%
36.02089            1      0.6%
35.921425           1      0.6%
35.81363            1      0.6%
35.875107           1      0.6%
35.828213           1      0.6%
35.558064           1      0.6%
35.962727           1      0.6%
35.64561            1      0.6%
Other values (134)  134    83.8%
(Missing)           15     9.4%

Minimum values

Value               Count  Frequency (%)
35.558064           1      0.6%
35.576813           1      0.6%
35.58796            1      0.6%
35.591465           1      0.6%
35.604584           1      0.6%
35.60592            1      0.6%
35.60599            1      0.6%
35.606064           1      0.6%
35.630238           1      0.6%
35.63166            1      0.6%

Maximum values

Value               Count  Frequency (%)
36.02089            1      0.6%
35.999336           1      0.6%
35.97954            1      0.6%
35.9787             1      0.6%
35.977215           1      0.6%
35.96533            1      0.6%
35.962727           1      0.6%
35.957382           1      0.6%
35.956623           1      0.6%
35.956295           1      0.6%

listingStatus
Categorical

Distinct: 2
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

RECENTLY_SOLD: 154
PENDING: 6

Length

Max length: 13
Median length: 13
Mean length: 12.775
Min length: 7

Characters and Unicode

Total characters: 2,044
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RECENTLY_SOLD
2nd row: RECENTLY_SOLD
3rd row: RECENTLY_SOLD
4th row: RECENTLY_SOLD
5th row: RECENTLY_SOLD

Common Values

Value              Count  Frequency (%)
RECENTLY_SOLD      154    96.2%
PENDING            6      3.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Value              Count  Frequency (%)
recently_sold      154    96.2%
pending            6      3.8%

Most occurring characters

Value              Count  Frequency (%)
E                  314    15.4%
L                  308    15.1%
N                  166    8.1%
D                  160    7.8%
R                  154    7.5%
C                  154    7.5%
T                  154    7.5%
Y                  154    7.5%
_                  154    7.5%
S                  154    7.5%
Other values (4)   172    8.4%

Most occurring categories

Value                  Count  Frequency (%)
Uppercase Letter       1890   92.5%
Connector Punctuation  154    7.5%

Most frequent character per category

Uppercase Letter
Value              Count  Frequency (%)
E                  314    16.6%
L                  308    16.3%
N                  166    8.8%
D                  160    8.5%
R                  154    8.1%
C                  154    8.1%
T                  154    8.1%
Y                  154    8.1%
S                  154    8.1%
O                  154    8.1%
Other values (3)   18     1.0%

Connector Punctuation
Value              Count  Frequency (%)
_                  154    100.0%

Most occurring scripts

Value              Count  Frequency (%)
Latin              1890   92.5%
Common             154    7.5%

Most frequent character per script

Latin
Value              Count  Frequency (%)
E                  314    16.6%
L                  308    16.3%
N                  166    8.8%
D                  160    8.5%
R                  154    8.1%
C                  154    8.1%
T                  154    8.1%
Y                  154    8.1%
S                  154    8.1%
O                  154    8.1%
Other values (3)   18     1.0%

Common
Value              Count  Frequency (%)
_                  154    100.0%

Most occurring blocks

Value              Count  Frequency (%)
ASCII              2044   100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
E                  314    15.4%
L                  308    15.1%
N                  166    8.1%
D                  160    7.8%
R                  154    7.5%
C                  154    7.5%
T                  154    7.5%
Y                  154    7.5%
_                  154    7.5%
S                  154    7.5%
Other values (4)   172    8.4%

zpid
Real number (ℝ≥0)

UNIQUE

Distinct: 160
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 300025725.4
Minimum: 6382246
Maximum: 2090244978
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 6382246
5th percentile: 6405195.1
Q1: 6520552.5
Median: 89497624.5
Q3: 133218598
95th percentile: 2065570936
Maximum: 2090244978
Range: 2083862732
Interquartile range (IQR): 126698045.5

Descriptive statistics

Standard deviation: 616728370.3
Coefficient of variation (CV): 2.055584965
Kurtosis: 4.456485512
Mean: 300025725.4
Median Absolute Deviation (MAD): 82964892
Skewness: 2.49740194
Sum: 4.800411607 × 10^10
Variance: 3.803538828 × 10^17
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
119341237           1      0.6%
120874437           1      0.6%
336867486           1      0.6%
336868062           1      0.6%
50120821            1      0.6%
53457969            1      0.6%
60396561            1      0.6%
60399282            1      0.6%
60537327            1      0.6%
62038343            1      0.6%
Other values (150)  150    93.8%

Minimum values

Value               Count  Frequency (%)
6382246             1      0.6%
6391515             1      0.6%
6392733             1      0.6%
6394553             1      0.6%
6396232             1      0.6%
6401045             1      0.6%
6404051             1      0.6%
6404741             1      0.6%
6405219             1      0.6%
6406651             1      0.6%

Maximum values

Value               Count  Frequency (%)
2090244978          1      0.6%
2079468496          1      0.6%
2068747428          1      0.6%
2068578521          1      0.6%
2068498386          1      0.6%
2067805081          1      0.6%
2066797428          1      0.6%
2065797946          1      0.6%
2065558988          1      0.6%
2064233045          1      0.6%

bathrooms
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 10
Distinct (%): 6.5%
Missing: 6
Missing (%): 3.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.11038961
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 3
Median: 3
Q3: 3
95th percentile: 5
Maximum: 7
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.9007018927
Coefficient of variation (CV): 0.2895784791
Kurtosis: 2.988494071
Mean: 3.11038961
Median Absolute Deviation (MAD): 0
Skewness: 1.016682133
Sum: 479
Variance: 0.8112638995
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Common values

Value              Count  Frequency (%)
3                  91     56.9%
2                  23     14.4%
4                  22     13.8%
5                  8      5.0%
1                  3      1.9%
2.5                2      1.2%
6                  2      1.2%
3.5                1      0.6%
7                  1      0.6%
1.5                1      0.6%
(Missing)          6      3.8%

Minimum values

Value              Count  Frequency (%)
1                  3      1.9%
1.5                1      0.6%
2                  23     14.4%
2.5                2      1.2%
3                  91     56.9%
3.5                1      0.6%
4                  22     13.8%
5                  8      5.0%
6                  2      1.2%
7                  1      0.6%

Maximum values

Value              Count  Frequency (%)
7                  1      0.6%
6                  2      1.2%
5                  8      5.0%
4                  22     13.8%
3.5                1      0.6%
3                  91     56.9%
2.5                2      1.2%
2                  23     14.4%
1.5                1      0.6%
1                  3      1.9%

livingArea
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 151
Distinct (%): 98.1%
Missing: 6
Missing (%): 3.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 2322.012987
Minimum: 729
Maximum: 6481
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 729
5th percentile: 1064.3
Q1: 1681.5
Median: 2191
Q3: 2741.75
95th percentile: 3999.5
Maximum: 6481
Range: 5752
Interquartile range (IQR): 1060.25

Descriptive statistics

Standard deviation: 960.4174269
Coefficient of variation (CV): 0.4136141495
Kurtosis: 2.984336909
Mean: 2322.012987
Median Absolute Deviation (MAD): 537
Skewness: 1.266531835
Sum: 357590
Variance: 922401.6338
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1656                2      1.2%
1860                2      1.2%
2471                2      1.2%
2849                1      0.6%
918                 1      0.6%
1734                1      0.6%
3115                1      0.6%
1369                1      0.6%
2491                1      0.6%
2859                1      0.6%
Other values (141)  141    88.1%
(Missing)           6      3.8%

Minimum values

Value               Count  Frequency (%)
729                 1      0.6%
734                 1      0.6%
918                 1      0.6%
950                 1      0.6%
990                 1      0.6%
1008                1      0.6%
1021                1      0.6%
1063                1      0.6%
1065                1      0.6%
1082                1      0.6%

Maximum values

Value               Count  Frequency (%)
6481                1      0.6%
5975                1      0.6%
5356                1      0.6%
4551                1      0.6%
4511                1      0.6%
4496                1      0.6%
4214                1      0.6%
4058                1      0.6%
3968                1      0.6%
3837                1      0.6%

lotAreaUnit
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 1.3%
Missing: 1
Missing (%): 0.6%
Memory size: 1.4 KiB

sqft: 100
acres: 59

Length

Max length: 5
Median length: 4
Mean length: 4.371069182
Min length: 4

Characters and Unicode

Total characters: 695
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: acres
2nd row: sqft
3rd row: sqft
4th row: sqft
5th row: sqft

Common Values

Value              Count  Frequency (%)
sqft               100    62.5%
acres              59     36.9%
(Missing)          1      0.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value              Count  Frequency (%)
sqft               100    62.9%
acres              59     37.1%

Most occurring characters

Value              Count  Frequency (%)
s                  159    22.9%
q                  100    14.4%
f                  100    14.4%
t                  100    14.4%
a                  59     8.5%
c                  59     8.5%
r                  59     8.5%
e                  59     8.5%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   695    100.0%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
s                  159    22.9%
q                  100    14.4%
f                  100    14.4%
t                  100    14.4%
a                  59     8.5%
c                  59     8.5%
r                  59     8.5%
e                  59     8.5%

Most occurring scripts

Value              Count  Frequency (%)
Latin              695    100.0%

Most frequent character per script

Latin
Value              Count  Frequency (%)
s                  159    22.9%
q                  100    14.4%
f                  100    14.4%
t                  100    14.4%
a                  59     8.5%
c                  59     8.5%
r                  59     8.5%
e                  59     8.5%

Most occurring blocks

Value              Count  Frequency (%)
ASCII              695    100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
s                  159    22.9%
q                  100    14.4%
f                  100    14.4%
t                  100    14.4%
a                  59     8.5%
c                  59     8.5%
r                  59     8.5%
e                  59     8.5%

Interactions

[Figures omitted: 100 pairwise interaction plots of the 10 numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
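
As an illustration of the calculation described above, the following sketch ranks both variables (ties receive the average of their ranks) and then applies the covariance over standard deviations formula to the ranks. The function names and the toy data are hypothetical; this is not the profiling tool's own implementation.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # The 1/(n - 1) factors of covariance and standard deviations cancel.
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman(x, y)
r = pearson(x, y)
print(rho, r)
```

On this toy data ρ is 1 because the relationship is perfectly monotonic, while Pearson's r stays below 1 because it is not linear.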

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
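
As a rough sketch of that formula, the snippet below computes r in plain Python and checks the location and scale invariance mentioned above. The function name `pearson_r` and the sample values are assumptions made for illustration only.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard
    deviations (the 1/n factors cancel, so they are omitted)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Illustrative data: a monotonic but nonlinear relationship, so r is below 1.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r = pearson_r(x, y)

# Shifting and positively rescaling y leaves r unchanged (up to rounding).
r_rescaled = pearson_r(x, [3 * v + 7 for v in y])
print(r, r_rescaled)
```

Rescaling by a negative factor would flip the sign of r while keeping its magnitude.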

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
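
The pair-counting definition above corresponds to the variant usually called τ-a. A minimal sketch follows, with a hypothetical function name and made-up inputs. Statistical libraries often use the tie-adjusted τ-b instead, so values may differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Illustrative data: 6 pairs in total, 5 concordant and 1 discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```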

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
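
For orientation, here is one way the bias-corrected coefficient could be computed from two lists of category labels. This is a sketch under assumptions: it follows the commonly cited form of Bergsma's correction, the function name and the example labels are hypothetical, and any further adjustments the report's own implementation may apply are not shown in this document.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long lists of labels."""
    n = len(a)
    rows, cols = Counter(a), Counter(b)
    joint = Counter(zip(a, b))

    # Pearson chi-squared statistic from observed vs. expected counts.
    chi2 = 0.0
    for row_label, row_count in rows.items():
        for col_label, col_count in cols.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(rows), len(cols)
    # Correction terms: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table, e.g. a single category
    return math.sqrt(phi2_corr / denom)


# Illustrative labels: the second variable is fully determined by the first.
property_type = ["CONDO", "TOWNHOUSE"] * 10
unit = ["x", "y"] * 10
v_perfect = cramers_v_corrected(property_type, unit)

# Illustrative labels: every combination occurs equally often.
v_independent = cramers_v_corrected(["A", "A", "B", "B"] * 5,
                                    ["x", "y", "x", "y"] * 5)
print(v_perfect, v_independent)
```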

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. The Phik project provides extensive documentation.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
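
One plausible reading of nullity correlation is the Pearson correlation between per-column presence indicators (1 where a value is present, 0 where it is missing). The sketch below takes that reading as an assumption. The function name is hypothetical, `None` stands in for a missing value, and the column values are made up.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Returns NaN when either column is entirely present or entirely
    missing, since its indicator then has zero variance.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    sa = math.sqrt(sum((p - ma) ** 2 for p in a))
    sb = math.sqrt(sum((q - mb) ** 2 for q in b))
    if sa == 0 or sb == 0:
        return float("nan")
    return cov / (sa * sb)


# Illustrative columns: bedrooms and bathrooms are missing on the same row,
# while price is always present.
bedrooms = [4.0, 4.0, None, 3.0]
bathrooms = [4.0, 3.0, None, 2.0]
price = [675000, 625000, 80850, 248800]

both_missing_together = nullity_correlation(bedrooms, bathrooms)
undefined = nullity_correlation(bedrooms, price)
print(both_missing_together, undefined)
```

A value near 1 means the two columns tend to be missing on the same rows; a value near -1 means one tends to be present exactly when the other is missing.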
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

     Unnamed: 0   propertyType  lotAreaValue  address   price  bedrooms   longitude   latitude  listingStatus       zpid  bathrooms  livingArea  lotAreaUnit
  0           0  SINGLE_FAMILY          0.25    27587  675000       4.0  -78.501000  35.962727  RECENTLY_SOLD  119341237        4.0      2849.0        acres
  1           1  SINGLE_FAMILY       9147.60    27587  625000       4.0  -78.487010  35.941480  RECENTLY_SOLD  120874437        4.0      2741.0         sqft
  2           2  SINGLE_FAMILY       6098.00    27502    2300       3.0  -78.852610  35.725338  RECENTLY_SOLD  122067190        2.5      1819.0         sqft
  3           3  SINGLE_FAMILY      10018.80    27613  785000       4.0  -78.711170  35.866283  RECENTLY_SOLD  122142882        4.0      3373.0         sqft
  4           4  SINGLE_FAMILY       8712.00    27540  805109       4.0  -78.776024  35.655872  RECENTLY_SOLD  122276797        3.0      2861.0         sqft
  5           5  SINGLE_FAMILY       3484.80    27502  615000       4.0  -78.840330  35.732754  RECENTLY_SOLD  122280086        3.0      2405.0         sqft
  6           6      TOWNHOUSE       2178.00    27606  395000       4.0  -78.733390  35.746270  RECENTLY_SOLD  125167277        3.0      2337.0         sqft
  7           7            LOT       8712.00    27526   80850       NaN  -78.779150  35.606064  RECENTLY_SOLD  131712219        NaN         NaN         sqft
  8           8          CONDO          0.00    27603  248800       3.0  -78.684290  35.750446  RECENTLY_SOLD  132183632        2.0      1192.0         sqft
  9           9          CONDO          0.00    27607  270000       3.0  -78.692880  35.813587  RECENTLY_SOLD  132191228        2.0       990.0         sqft

Last rows

     Unnamed: 0   propertyType  lotAreaValue  address   price  bedrooms   longitude   latitude  listingStatus       zpid  bathrooms  livingArea  lotAreaUnit
150          30      TOWNHOUSE        435.60    27612  410000       3.0  -78.739975  35.879017  RECENTLY_SOLD    6565279        3.0      2026.0         sqft
151          31  SINGLE_FAMILY          0.34    27587  625000       5.0  -78.528340  35.936184  RECENTLY_SOLD   67557981        4.0      4214.0        acres
152          32      TOWNHOUSE       2613.60    27612  441000       3.0  -78.719590  35.859844  RECENTLY_SOLD   68320477        3.0      1860.0         sqft
153          33      TOWNHOUSE       2178.00    27560  620000       3.0  -78.825740  35.807280  RECENTLY_SOLD   69563224        4.0      2866.0         sqft
154          34      TOWNHOUSE       1306.00    27614  360000       3.0  -78.558860  35.922450        PENDING   69565980        3.0      1728.0         sqft
155          35  SINGLE_FAMILY       7840.80    27545  405000       4.0  -78.502330  35.781654  RECENTLY_SOLD   81436565        3.0      2627.0         sqft
156          36  SINGLE_FAMILY       6098.40    27539  562920       4.0  -78.827000  35.691307  RECENTLY_SOLD   83893794        3.0      2545.0         sqft
157          37  SINGLE_FAMILY       4791.00    27502  660000       3.0  -78.842575  35.746210        PENDING   89494954        3.0      2680.0         sqft
158          38  SINGLE_FAMILY       9147.60    27587  610000       4.0  -78.503500  35.956623  RECENTLY_SOLD   89501055        3.0      2881.0         sqft
159          39  SINGLE_FAMILY       9147.60    27540  540000       4.0  -78.876595  35.634600  RECENTLY_SOLD   94723418        3.0      3041.0         sqft